Conversation

@bradley-erickson (Collaborator) commented Apr 18, 2025

Added a generic rate-limiting decorator.
TODO

  • Pull rate limits from settings
  • Figure out the best way to handle this on the dashboards (what do we say to the user?)

@bradley-erickson (Collaborator, Author)

This is merging into a branch (dashboard updates) for now, since it was easier to work on this way in my environment. Once that other branch is merged, this can be merged into master.

async def check_rate_limit(user_id):
    '''Reusable rate limiter with service-specific settings'''
    # TODO fetch from pmss/define appropriate window
    max_requests = 2
Contributor:

Something more descriptive, e.g.:

max_user_requests_per_window



def create_rate_limiter(service_name):
    '''Factory function for rate limiters with closure over service name'''
Contributor:

That repeats the function name. I would describe what the rate limiter does (e.g. max # of requests per window). Basically, the autogenerated docs should tell me what this does.
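
E.g., something along these lines (wording illustrative):

    def create_rate_limiter(service_name):
        '''Return an async check that allows each user at most
        `max_requests` calls to `service_name` per time window.'''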

now = time.time()

# Initialize user/service tracking
key = f'rate_limit:{service_name}:{user_id}'
Contributor:

limiter_key

RATE_LIMITERS[key] = collections.deque()

# Expire old requests
timestamps = RATE_LIMITERS[key]
Contributor:

request_timestamps, maybe? Or prior_requests_timestamps?



def rate_limited(service_name):
    '''Decorator for async functions needing rate limiting'''
Contributor:

This should clearly document what kinds of functions this can wrap, and the protocol. Things like this:

            if 'runtime' not in kwargs:
                raise TypeError(f'`{func.__name__}` requires `runtime` keyword argument for checking rate limits.')

should be in the docstring. Basically, I want to know how to use this.
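
For instance, a sketch of the kind of docstring being requested (wording and service name illustrative, mirroring the runtime check above):

    def rate_limited(service_name):
        '''Decorator for async functions that should be rate limited per user.

        The wrapped function must be called with a `runtime` keyword
        argument (used to look up the active user); a TypeError is raised
        if it is missing. When the user exceeds the per-window request
        limit for `service_name`, the call raises PermissionError instead
        of running.

        Usage:
            @rate_limited('my_service')
            async def handler(*args, runtime=None, **kwargs):
                ...
        '''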


runtime = kwargs['runtime']

check_limit = create_rate_limiter(service_name)
Contributor:

check_rate_limit would be slightly nicer.

user = await learning_observer.auth.get_active_user(request)
user_id = user[learning_observer.constants.USER_ID]
if not await check_limit(user_id):
    raise PermissionError(f'Rate limit exceeded for {service_name} service')
Contributor:

I'm actually a bit confused about when I'd want this behavior. It'd be helpful to know when this is used.

Seems like what I want:

  1. Queue up requests
  2. If they go obsolete (e.g. user navigates away), drop them
  3. If they don't, let the user know we're throttling and throttle them

Simply failing seems like it might be annoying to the user.
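
For what it's worth, a minimal sketch of throttling instead of failing, reusing the PR's rate check (the `wait_for_slot` helper and poll interval are hypothetical):

    import asyncio

    async def wait_for_slot(check_limit, user_id, poll_seconds=0.5):
        # Instead of raising PermissionError right away, poll until the
        # sliding window frees a slot, so the request is delayed rather
        # than rejected.
        while not await check_limit(user_id):
            await asyncio.sleep(poll_seconds)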

@pmitros (Contributor) commented Apr 23, 2025

I did a review. The code is fine at the code level, but I'm concerned at the algorithm level. I think what we want is:

  1. Have a rate limit.
  2. Keep a finite number of requests in parallel. A lot of this is for LLMs, where we might want e.g. a maximum of 5 parallel calls to OpenAI.
  3. We want some overall throttle (e.g. 30 requests per minute).
  4. If users go over that, simply let them know it will take a while since they're over the rate limit; run the request once they're dethrottled.
  5. If they navigate away or e.g. submit many times, drop obsolete in-queue requests which haven't gone out yet.
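
Not the PR's current behavior — just one possible shape of points 1-5 above, with illustrative names and limits:

    import asyncio
    import collections
    import time

    MAX_PARALLEL = 5        # point 2: e.g. at most 5 parallel LLM calls
    MAX_PER_WINDOW = 30     # point 3: overall throttle
    WINDOW_SECONDS = 60

    parallel_slots = asyncio.Semaphore(MAX_PARALLEL)
    request_times = collections.deque()

    async def run_throttled(make_request, is_obsolete=lambda: False):
        '''Run make_request() under both limits; drop it if it becomes
        obsolete (point 5) while still queued.'''
        async with parallel_slots:          # point 2: cap parallel requests
            while True:
                now = time.time()
                # Expire timestamps that have aged out of the window.
                while request_times and request_times[0] <= now - WINDOW_SECONDS:
                    request_times.popleft()
                if len(request_times) < MAX_PER_WINDOW:
                    break
                if is_obsolete():           # point 5: user navigated away
                    return None
                # Point 4: over the throttle, so wait (the UI can tell the
                # user it will take a while) until the oldest request ages out.
                await asyncio.sleep(request_times[0] + WINDOW_SECONDS - now)
            if is_obsolete():
                return None
            request_times.append(time.time())
            return await make_request()

A caller would wrap each outgoing LLM call in run_throttled, passing an is_obsolete check tied to the dashboard session so stale requests never go out.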
